Results

Species Richness

Richness Time Series

Table. Summary of richness mean and variability

Long-term mean and standard deviation in MSOM richness, and average of those sd’s. Gmex and Shelf have the highest and lowest long-term averages in species richness (MSOM).
reg mu sd mu_sd
ai 52.7 2.44 5.31
ebs 103.6 5.57 5.31
gmex 142.3 4.27 5.31
goa 95.7 8.51 5.31
neus 122.0 8.24 5.31
newf 65.8 4.37 5.31
sa 102.4 4.04 5.31
shelf 46.3 3.25 5.31
wctri 84.5 7.15 5.31

Figure 1. MSOM richness time series

richness_ts()
**Figure 1.** Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.

Figure 1. Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.

Figure S1. MSOM - naive scatter

naive_msom_scatter()
**Figure S1.** MSOM richness vs naive richness

Figure S1. MSOM richness vs naive richness

MSOM richness and Naive richness are pretty similar. MSOM richness is probably more accurate, or is at least more conservative b/c it has fewer significant trends. But their similarity should help justify using observed presences/ absences in other analyses.

Manuscript paragraph:
MSOM estimates of richness were greater than Naive estimates, but the two methods produced similar temporal dynamics in richness (Figure S1). Henceforth, we report MSOM richness estimates. The greatest long-term average richness was the Gulf of Mexico (142.3), lowest in the Scotian Shelf (46.3). The inter-region average of long-term standard deviations of richness was 5.3; the Aleutian Islands showed the lowest variability (sd = 2.4), and the Gulf of Alaska was the most variable (sd = 8.5).


Richness Trend

Looking for trends in species richness in both the naive estimates and the MSOM estimates. Using Kendall’s Tau_b, calculated for 1E4 resamplings of the posterior of richness. Tau is also calculated using a method that removes serial correlation to achieve independence of observations so that the p-values are correct.

load("../../pkgBuild/results/rich_naive_trend_kendall.RData")
rich_naive_trend_kendall[reg!="wcann",BH:=p.adjust(taup, method='BH')]
rich_naive_trend_kendall <- rich_naive_trend_kendall[
    reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=taup)
]
load("../../pkgBuild/results/rich_trend_kendall.RData")
rich_trend_kendall[reg!="wcann",BH:=p.adjust(pvalue, method="BH")]
rich_trend_kendall <- rich_trend_kendall[
    reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=pvalue)
]

Table S1. Naive tau

Table S1. Naive richness trends in each region. Estimate is Kendall’s Tau_b, BH is the Benjamini-Hochberg corrected p-value, and p.value is the original p-value.
reg estimate BH p.value
ai -0.061 0.837 0.837
ebs 0.501 0.000 0.000
gmex 0.853 0.000 0.000
goa 0.205 0.405 0.360
neus 0.337 0.010 0.008
newf 0.787 0.000 0.000
sa -0.569 0.000 0.000
shelf 0.705 0.000 0.000
wctri 0.764 0.005 0.003

In the Naive estimates, 7 regions had significant \(\tau_b\).

Table 2. MSOM tau

Table 2. MSOM richness trends in each region. Estimate is Kendall’s Tau_b, BH is the Benjamini-Hochberg corrected p-value, and p.value is the original p-value. Trends are calculated for 1E4 resampled combinations of the richness posterior. For each resampling, independence of observations is achieved before estimating Tau by removing serial correlation from resampled time series. This procedure retains the integrity of p-values.
reg estimate BH p.value
ebs 0.421 0.003 0.001
ai 0.232 0.365 0.325
goa 0.242 0.365 0.284
wctri 0.609 0.042 0.019
gmex 0.098 0.600 0.600
sa -0.218 0.212 0.141
neus 0.191 0.212 0.133
shelf 0.452 0.000 0.000
newf 0.727 0.001 0.000

In the MSOM estimates, 4 regions had significant \(\tau_{b}\).

Overall, ~half the regions show positive trends. No regions show significant negative trends (although SEUS is close, depending on the analysis). It’s a bit surprising how much removing the autocorrelation seemed to impact some of the trends (AI in particular, I think).

Manuscript paragraph:
Estimated slopes for long-term trends (Kendall’s \(\tau_b\)) in richness were positive for most regions. For Naive estimates, all the seven positive \(\tau\) were also significantly different from 0, whereas the two negative \(\tau_b\), Aleutian Islands and Southeast US, were not (Table S2). Long-term trends in MSOM estimates of richness had the same sign as Naive trends, except for Aleutian Islands, but now the only significant \(\tau_b\) were the following four regions: Eastern Bering Sea (\(\tau_b\) = 0.41), West Coast US (\(\tau_b\) = 0.61), Scotian Shelf (\(\tau_b\) = 0.45), and Newfoundland (\(\tau_b\) = 0.72) (Table 1). Richness trends were not significant in the Gulf of Mexico, Gulf of Alaska, and Northeast US, Aleutian Islands, Southeast US.


Colonization and Extinction

Colonization and Extinction Summary

Figure S2. Barplot of categories

categ_barplot()
**Figure S2.** Number of species beloning to the categories of both, neither, colonizer, leaver in each region

Figure S2. Number of species beloning to the categories of both, neither, colonizer, leaver in each region

Number of species in each category in each region.
ai ebs gmex goa neus newf sa shelf wctri
neither 39 64 105 63 56 49 69 27 64
both 6 44 31 17 83 12 35 20 15
colonizer 9 2 8 18 1 10 0 1 12
leaver 1 0 0 0 1 1 0 0 1

It’s the same pattern, whichever way you split it. However, AI is the only region that had more colonizers than both species. An interesting way to think about some of this is that the average sd in richness was 5.314, so when the number of colonizer or leaver species exceed’s that region’s sd, the impact of those categories, which I consider to be dubious, might start being relevant (though it’s not necessarily problematic, nor is this even close to an actual test for the significance of those categories to the trend). EBS and Shelf had significant positive trends in richness and very low numbers in the colonizer category. WCTRI and NEWF had similar numbers in the both and colonizer category.

% Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu % Date and time: Fri, Mar 31, 2017 - 23:44:49

Figure NotIncluded. Time series of colonizations and extinctions

col_ext_ts()
**Figure NotIncluded.** Number of colonizations (blue) and extinctions (red) over time in each region.

Figure NotIncluded. Number of colonizations (blue) and extinctions (red) over time in each region.

In most regions the differences in colonization and extinction numbers are similar over time. The most obvious exceptions are for the 3 regions that showed large initial spikes in richness; the GOA, GMEX, and AI regions initially have much larger numbers of colonizers than leavers, but this number shrinks rapidly until the two rates are ~equal.

For the regions with significant positive slopes, there is no visually obvious increase in colonizations relative to extinctions over time. Because the colonization and extinction numbers tend to track each other over the long-term, it it would be difficult to attribute the long-term changes in richness to a change in just colonization or extinction rates.

Manuscript paragraph:
A time series of richness can be decomposed into the colonizations and extinctions of individual species over time. We categorized species according to the following colonization extinction patterns: present in all years = neither (536 species), colonized and went extinct = both (263 species), initially absent but present every year after its colonization = colonizer (61 species), initially present but absent every year after its extinction = leaver (4 species). Most regions had the same overall ranking (neither > both > colonizer > leaver), except in the Northeast US where both was the most common and neither second, and in the Aleutian Islands where colonizer was the second most common and both third (Figure S2). In general, changes in richness were not due to permanent departures or introductions of species to the region. Furthermore, colonization and extinction rates did not become more dissimilar over time for any region (Figure NotIncluded). Colonizations were initially greater than extinctions in Aleutian Islands, Gulf of Alaska, and Gulf of Mexico, but this difference disappeared in the latter portion of these time series, as evidenced by these regions’ initially rapid increase in richness that later plateaued. The other regions did not show strong trends in the difference between colonizations and extinctions over time, making it difficult to attribute the long-term trends in richness to changes in just colonization or just extinction rates.


Spatial Clustering of Colonization and Extinction

Heat Maps

Figure 2. Colonization map

ceRate_map(ce="colonization")
**Figure 2.** Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure 2. Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure S3. Extinction map

ceRate_map(ce="extinction")
**Figure S3.** Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure S3. Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Hotspots can be seen in most regions. Newfoundland also has high values around its edge (as opposed to interior), it seems. NEUS and Gmex show very strong hotspots, and other locations tend to be much much lower. Other regions show more of a continuum.

rel_col_ext_rate <- mapDat[,j={
    map_smooth_col <- spatstat::Smooth(spatstat::ppp(
        x=.SD[,lon], y=.SD[,lat], 
        marks=.SD[,n_spp_col_weighted], window=mapOwin[[reg]]
    ), hmax=1)
    mark_range_col <- range(map_smooth_col, na.rm=TRUE)*10
    
    map_smooth_ext <- spatstat::Smooth(spatstat::ppp(
        x=.SD[,lon], y=.SD[,lat], 
        marks=.SD[,n_spp_ext_weighted], window=mapOwin[[reg]]
    ), hmax=1)
    mark_range_ext <- range(map_smooth_ext, na.rm=TRUE)*10
    ol <- list(
        minval_col=mark_range_col[1], maxval_col=mark_range_col[2], 
        max_o_min_col=do.call("/",as.list(rev(mark_range_col))),
        minval_ext=mark_range_ext[1], maxval_ext=mark_range_ext[2], 
        max_o_min_ext=do.call("/",as.list(rev(mark_range_ext)))
    )
    lapply(ol, function(x)if(is.numeric(x)){signif(x,3)}else{x})
},by=c("reg")]
The colonization and extinction intensity range and max/min ratio, and median among regions. Useful for assessing how big of a difference there is between red and blue for each region.
reg minval_col maxval_col max_o_min_col minval_ext maxval_ext max_o_min_ext
ebs 0.116 0.554 4.79 0.120 0.546 4.56e+00
ai 0.057 0.483 8.53 0.000 0.174 1.37e+09
goa 0.001 0.764 652.00 0.018 0.567 3.24e+01
wctri 0.023 1.160 50.10 0.009 0.951 1.03e+02
gmex 0.508 3.180 6.25 0.073 3.280 4.51e+01
sa 1.080 2.180 2.03 0.758 3.690 4.87e+00
neus 0.106 22.500 213.00 0.087 21.100 2.42e+02
shelf 0.000 1.220 25400.00 0.000 1.050 1.82e+04
newf 0.036 0.582 16.40 0.011 0.612 5.71e+01
MEDIAN 0.057 1.160 16.40 0.018 0.951 5.71e+01

Neighborhoods and Local Moran’s I

Figure S4. Colonization neighborhood

nb_moranI(ce="colonization")
**Figure S4.** Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S4. Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S5. Extinction neighborhood

nb_moranI(ce="extinction")
**Figure S5.** Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S5. Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.


Richness and Geographic Range

Richness and Range Density, Range Size

Looking for spatial footprint that will predict richness. Here I characterize the geographic distribution of a species as its range size. This is tricky for richness because richness is a community level trait, but I’ve characterized range size at the species level. In fact, this “species level” attribute is also potentially dyanmic – a species need not have fixed range size. So there are two levels of averaging – for each species take its long-term average value, then for each community take the average across species.

Note that I’d previously used range density in addition to range size. I’ve dropped the density metric in most places in order to simplify the results, because the range size results were closely related to the range density results.

First posterity, I’ll show a figure relating range size and range density. Then I’ll present the figure of how size can ‘predict’ richness. Finally, I’ll explore models more formally expressing the relationship between richness and size.

Figure S6. Relationship between range size and range density

rangeSizeDens()
**Figure S6**. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species' long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.

Figure S6. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species’ long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.

These plots show that there is a strong relationship between range size and range density. Interestingly, in Figure S6B, the cross-region relationship is negative (if each color had 1 pt), whereas the within-region relationship is positive.

The positive relationship between size and density is not surprising. My interpretation of density is ~population size. Often population size is correlated with range size. I think this is a standard result, but I need to double-check.

Figure 3. Species richness versus geographic range size

rich_geoRange("size", leg=TRUE, legPan=1, panLab=FALSE)
**Figure 3.** Species richness vs geographic range size. Range size is presented as each species' long-term average of the proportion of sites it occupied. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.

Figure 3. Species richness vs geographic range size. Range size is presented as each species’ long-term average of the proportion of sites it occupied. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.

Range size is a pretty good predictor of species richness. I think I had originally missed the range size relationship b/c I hadn’t done the same aggregating procedure. The interpretation I have is that richness is highest when you have a bunch of rare species.

The goal here is to see if species richness is predicted by the typical range size of community’s constituent species. First I’ll run different types of models just to explore whether this is true, in general (across regions). Then I’ll drill in to each region individually to answer the same question.

Table. Regressions relating richness to range size

# This is a function that'll help summarize model fit and coeffs and parameter significance:  
mod_smry <- function(m, pred_name=c("density","size","time","type","time:type")){
    pred_name <- match.arg(pred_name)
    sc <- sem.coefs(m)
    sc[,c("estimate","p.value")] <- lapply(sc[,c("estimate","p.value")], signif, 4)
    mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
    mod_call <- as.character(mod_call)[2]
    fits <- sem.model.fits(m)
    fits[,c("estimate","p.value","Marginal","Conditional")] <- lapply(fits[,c("Marginal","Conditional")], signif, 4)
    out <- cbind(
        mod_call = mod_call,
        sc[sc[,"predictor"]==pred_name,],
        fits, 
        AIC=round(AIC(m), getOption("digits"))
    )
    out[,c(
        # "std.error","N",
        "Class","mod_call","predictor","estimate","p.value","Marginal","Conditional","AIC"
    )]
}

# Make a data set that is useful for these regressions (short variable names, etc):  
range_reg <- comm_master[,list(
    reg, year, rich=reg_rich, density=propTow_occ_avg, size=propStrata_avg_ltAvg
)]

# Fit different models to the whole data set
rSize_mods <- list()
rSize_mods[[1]] <- lm(rich ~ size, data=range_reg)
rSize_mods[[2]] <- lm(rich ~ size*reg, data=range_reg)
rSize_mods[[3]] <- lme4::lmer(rich ~ size + (1|reg), data=range_reg)
rSize_mods[[4]] <- lme4::lmer(rich ~ size + (size|reg), data=range_reg)
rich_size_smry <- rbindlist(lapply(rSize_mods, mod_smry, pred_name="size"))

# Fit same model to each region separately 
rSize_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    rSize_reg_mods[[r]] <- lm(rich ~ size, data=range_reg[reg==ur[r]])
}
rich_s_reg_smry <- data.table(
    reg=ur, 
    rbind(
        rbindlist(lapply(rSize_reg_mods, mod_smry, pred_name="size"))
    )
)
setkey(rich_s_reg_smry, reg, Class, predictor)

setnames(rich_s_reg_smry, old=c("Marginal","Conditional","p.value"), new=c("MargR2","CondR2","pval"))
setkey(rich_s_reg_smry, reg, Class, mod_call, predictor)
rich_s_reg_smry[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]
rich_s_reg_smry <- dcast(rich_s_reg_smry, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))
Summary of regression models involving richness and community metric of range size.
Class mod_call predictor estimate p.value Marginal Conditional AIC
lm rich ~ size size -41.7 0.124 0.012 NA 1926
lm rich ~ size * reg size -298.7 0.006 0.995 NA 927
lmerMod rich ~ size + (1 | reg) size -320.5 0.000 0.293 0.995 1133
lmerMod rich ~ size + (size | reg) size -446.7 0.000 0.332 0.999 1005
Regression models of richness predicted by community metric of range size, but each region has a separate model.
reg Class mod_call MargR2 CondR2 AIC pval_size BH_size
ai lm rich ~ size 0.680 NA 46.8 0.001 0.001
ebs lm rich ~ size 0.863 NA 137.9 0.000 0.000
gmex lm rich ~ size 0.767 NA 77.8 0.000 0.000
goa lm rich ~ size 0.802 NA 76.5 0.000 0.000
neus lm rich ~ size 0.852 NA 169.6 0.000 0.000
newf lm rich ~ size 0.868 NA 65.1 0.000 0.000
sa lm rich ~ size 0.361 NA 134.5 0.002 0.002
shelf lm rich ~ size 0.898 NA 124.3 0.000 0.000
wctri lm rich ~ size 0.957 NA 41.1 0.000 0.000
Average of above region-specific rich ~ size models.
mod_call reg Class MargR2 CondR2 AIC pval_size BH_size
rich ~ size NA NA 0.783 NA 97.1 0 0

All models are pretty good predictors. Well, the most basic model kinda sucks I guess. It needs to account for some of the between-region variation.

Predicting Richness: Range Size or Density?

As far as picking one or the other, it doesn’t end up mattering much. Range size is a lot better than density in NEUS, and density outperforms size in AI. Otherwise, size as a slight edge over density on average, although both predictors are significant in all regions.

Conclusion for Richness and Geographic Range

The two metrics of geographic range are well correlated. Furthermore, richness can be predicted pretty well using regressions with either as a predictor. There are large differences among regions, though. This is probably because richness is not readily comparable among most regions. Regions vary mostly in their intercept values, and they have fairly similar slopes (though they are not identical, and model fits improve when allowing slopes to vary among regions; it’s just that the improvement is small compared to allowing intercepts to vary among regions).

The interpretation of the result that geographic distribution predicts species richness is likely associated with species rarity. When the average range density or range size of a community is low, it means it has a lot of species that are rare (at either spatial scale). It’s these rare species that come and go, and form the dynamics of richness that we observe. When that dynamical value is high, it implies that an above-average number of the dynamic species are present. Because those species are transient (dynamic), they are also probably rare.


Geographic Range and Colonizations and Extinctions

Total Colonizations vs Extinctions and Geographic Range

The result that richness is predicted by geographic range implied an underlying association between range, colonization/ extinction, and richness itself. Above, richness was explained with range. Later, the results will explain how range changes near a colonization/ extinction event. Here, between the two, the results show how the number of colonizations is related to range.

Figure S7. Total Colonizations/ Extinctions vs Georaphic Range

ceEventRange("mean_size")
**Figure S7.** Number of colonizations and extinctions as a function of range size and range density.

Figure S7. Number of colonizations and extinctions as a function of range size and range density.

Yup, this is definitely a thing. Long-term average range size and range density predict how many colonizations and extinctions a species is likely to have. This will lead nicely into examing how range changes prior to an extinction or after a colonization.

Range Change after Colonization/ before Extinction

Figure 4. Changes in Geographic Range before Extinction and after Colonization

rangeSize_absenceTime("rangeSize")
**Figure 4.** Geographic range size vs years until extinction (A) and years after colonization (B). For visualization purposes, range size is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range size must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.

Figure 4. Geographic range size vs years until extinction (A) and years after colonization (B). For visualization purposes, range size is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range size must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.

Figure 4b. Changes in Geographic Range before Extinction and after Colonization

rangeSize_absenceTime("rangeDensity")
**Figure 4b.** Geographic range density vs years until extinction (A) and years after colonization (B). For visualization purposes, range density is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range density must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.

Figure 4b. Geographic range density vs years until extinction (A) and years after colonization (B). For visualization purposes, range density is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range density must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.

Range size declines near an absence much more consistently than does range density; both are (relatively) low just before extinction and just after colonization. However, range density has much more variable intercepts among regions, whereas range size does not.

This makes sense, at least somewhat, because colonization and extinction events are defined at the site level; though the outcome isn’t necessitated by this formulation, because size could drop suddenly. In fact, when a species is absent, both its range size and its range density must be 0 (though, range density is technically calculated for only those sites that are occupied, so I supposed it’s technically undefined according to the equations I’m using).

I think the regressions for range size should omit an intercept, while the regressions for range density should have it. This might be hard to justify fully a priori (though see my thinking in previous paragraph), so I’ll probably just do a model selection and maybe discuess the difference if one model has an intercept and the other does not.

Table. Regressions w/ regions pooled relating time until extinction or after colonization to geographic range

# handy data set for regressions
rangeTimeDT <-  spp_master[!is.na(stretch_type) & propStrata!=0]
rangeTimeDT <- rangeTimeDT[,list(
    reg=reg, 
    event=as.character(event_year), 
    spp=spp, 
    type=as.character(stretch_type),
    time=ext_dist, 
    size=propStrata, 
    density=propTow_occ
)]

# summary function used in tables
mod_smry2 <- function(m){
    mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
    mod_call <- as.character(mod_call)[2]
    fits <- sem.model.fits(m)
    fits[,c("estimate","p.value","Marginal","Conditional")] <- lapply(fits[,c("Marginal","Conditional")], signif, 4)
    anova_sum <- car::Anova(m)
    
    out <- data.frame(
        mod_call = mod_call,
        data.frame(predictor=rownames(anova_sum), anova_sum)[,c("predictor","Pr..Chisq.")],
        fits, 
        AIC=AIC(m)#signif(AIC(m), getOption("digits"))
    )
    out[,c("Class","mod_call", "predictor", "Pr..Chisq.","Marginal","Conditional","AIC")]
}
# models for range size
sizeCE_mods <- list()
sizeCE_mods[[1]] <- lme4::lmer(size ~ time + (time|spp/reg), data=rangeTimeDT)
# sizeCE_mods[[2]] <- lme4::lmer(size ~ time + (time|spp/reg) + (1|type/reg), data=rangeTimeDT) # error
sizeCE_mods[[2]] <- lme4::lmer(size ~ time*type + (time|spp/reg), data=rangeTimeDT)
# lmerTest::step(sizeCE_mods[[2]]) # doing most complicated throws an error
Mixed effect models predicting range size (size) from the temporal proximity (time) after colonization and/or until extinction (factor name = type). Each model uses data from all regions (reg) and all species (spp), and includes both time to extinction and time from colonization (type).
Dependent variable:
size
(1) (2)
time 0.007*** 0.007***
typepre_ext 0.0004
time:typepre_ext 0.0002
Constant 0.062*** 0.061***
Observations 5,726 5,726
Log Likelihood 7,292.000 7,281.000
Akaike Inf. Crit. -14,566.000 -14,539.000
Bayesian Inf. Crit. -14,506.000 -14,466.000
Note: p<0.1; p<0.05; p<0.01
Same as above, but shows slightly different metrics
Class mod_call predictor pval MargR2 CondR2 AIC
lmerMod size ~ time + (time | spp/reg) time 0.000 0.124 0.84 -14566
lmerMod size ~ time * type + (time | spp/reg) time 0.000 0.123 0.84 -14539
lmerMod size ~ time * type + (time | spp/reg) type 0.570 0.123 0.84 -14539
lmerMod size ~ time * type + (time | spp/reg) time:type 0.723 0.123 0.84 -14539

Range size seems to be important to include type as a fixed effect. It does not need an interaction between time*type. I did additional testing beyond what’s presenting here, and I can confirm that having spp as a random factor is useful, too.

Table. Regressions separate regions – range size vs time until, SIMPLE

getCoefs <- function(mList){
    outList <- list()
    for(r in 1:length(ur)){
        outList[[ur[r]]] <- rbindlist(lapply(
            coef(mList[[r]]), function(x)as.list(colMeans(x))
        ), idcol=TRUE)
        setnames(outList[[ur[r]]], c(".id"), c("randomGroup"))
        mod_call <- switch(class(mList[[r]]), lmerMod=mList[[r]]@call, lm=mList[[r]]$call)
        mod_call <- as.character(mod_call)[2]
        outList[[ur[r]]][,mod_call:=mod_call]
    }
    outList <- rbindlist(outList, idcol=TRUE)
    setnames(outList, c(".id"), c("reg"))
    # setcolorder(outList, c("reg", "mod_call", "randomGroup", "(Intercept)", "time", "typepre_ext"))
    outList
}

sTime_reg_mods3 <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    sTime_reg_mods3[[r]] <- lme4::lmer(size ~ time + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry3 <- data.table(rbind(
    rbindlist(structure(lapply(sTime_reg_mods3, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry3, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry3, reg, Class, mod_call, predictor)
sTime_reg_smry3[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry3 <- dcast(sTime_reg_smry3, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line

modCoef3 <- getCoefs(sTime_reg_mods3)
sTime_reg_smry3 <- merge(sTime_reg_smry3, modCoef3, by=c("reg","mod_call"))
setnames(sTime_reg_smry3, old=c("(Intercept)"), new=c("Int"))
Summary statistics for fits of predicting range SIZE from years before extinction and years after colonization. These are mixed effect models of the form size ~ time + (time | spp)
reg MargR2 CondR2 AIC pval_time BH_time Int time
ai 0.089 0.708 -168 0.002 0.002 0.135 0.007
ebs 0.070 0.868 -3334 0.000 0.000 0.058 0.006
gmex 0.099 0.705 -1021 0.000 0.000 0.091 0.012
goa 0.084 0.683 -799 0.000 0.000 0.076 0.005
neus 0.129 0.734 -7726 0.000 0.000 0.029 0.003
newf 0.084 0.868 -565 0.015 0.015 0.064 0.011
sa 0.198 0.608 -1195 0.000 0.000 0.092 0.013
shelf 0.226 0.670 -1954 0.000 0.000 0.063 0.005
wctri 0.184 0.907 -288 0.000 0.000 0.022 0.016

Table. Regressions separate regions – geographic range vs time until extinction or after colonization

# Fit same model to each region separately 
sTime_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    sTime_reg_mods[[r]] <- lme4::lmer(size ~ time + type + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry <- data.table(rbind(
    rbindlist(structure(lapply(sTime_reg_mods, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry, reg, Class, mod_call, predictor)
sTime_reg_smry[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry <- dcast(sTime_reg_smry, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line

# add coefficients to the sTime_reg_smry data.table
modCoef <- getCoefs(sTime_reg_mods)
sTime_reg_smry <- merge(sTime_reg_smry, modCoef, by=c("reg","mod_call"))
setnames(sTime_reg_smry, old=c("(Intercept)"), new=c("Int"))
Summary statistics for fits of predicting range SIZE from years before extinction and years after colonization. These are mixed effect models of the form size ~ time + type + (time | spp)
reg MargR2 CondR2 AIC pval_time pval_type BH_time BH_type Int time typepre_ext
ai 0.090 0.703 -161 0.002 0.559 0.002 0.718 0.129 0.007 0.021
ebs 0.070 0.868 -3323 0.000 0.669 0.000 0.753 0.058 0.006 0.001
gmex 0.099 0.710 -1014 0.000 0.141 0.000 0.594 0.086 0.012 0.017
goa 0.087 0.684 -791 0.000 0.463 0.000 0.718 0.078 0.005 -0.011
neus 0.132 0.735 -7714 0.000 0.264 0.000 0.594 0.028 0.003 0.002
newf 0.081 0.865 -556 0.015 0.507 0.015 0.718 0.067 0.010 -0.008
sa 0.193 0.604 -1186 0.000 0.220 0.000 0.594 0.088 0.013 0.010
shelf 0.232 0.677 -1957 0.000 0.000 0.000 0.002 0.070 0.004 -0.021
wctri 0.182 0.907 -280 0.000 0.801 0.000 0.801 0.021 0.016 0.005

Table. Regressions separate regions – range size vs time until, WITH INTERACTION

sTime_reg_mods2 <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    sTime_reg_mods2[[r]] <- lme4::lmer(size ~ time * type + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry2 <- data.table(rbind(
    rbindlist(structure(lapply(sTime_reg_mods2, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry2, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry2, reg, Class, mod_call, predictor)
sTime_reg_smry2[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry2 <- dcast(sTime_reg_smry2, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line
modCoef2 <- getCoefs(sTime_reg_mods2)
sTime_reg_smry2 <- merge(sTime_reg_smry2, modCoef2, by=c("reg","mod_call"))
setnames(sTime_reg_smry2, old=c("(Intercept)"), new=c("Int"))
Summary statistics for fits of predicting range SIZE from years before extinction and years after colonization. These are mixed effect models of the form size ~ time * type + (time | spp)
reg MargR2 CondR2 AIC pval_time pval_time:type pval_type BH_time BH_time:type BH_type Int time typepre_ext time:typepre_ext
ai 0.117 0.736 -154 0.000 0.028 0.526 0.000 0.085 0.676 0.120 0.008 0.094 -0.013
ebs 0.069 0.868 -3309 0.000 0.574 0.669 0.000 0.738 0.753 0.058 0.005 -0.001 0.001
gmex 0.099 0.714 -1003 0.000 0.675 0.143 0.000 0.759 0.608 0.087 0.011 0.012 0.002
goa 0.083 0.678 -783 0.000 0.060 0.463 0.000 0.108 0.676 0.081 0.004 -0.048 0.008
neus 0.136 0.731 -7713 0.000 0.000 0.270 0.000 0.001 0.608 0.026 0.004 0.007 -0.002
newf 0.083 0.864 -547 0.014 0.491 0.505 0.014 0.736 0.676 0.065 0.011 0.002 -0.006
sa 0.190 0.602 -1175 0.000 0.789 0.221 0.000 0.789 0.608 0.089 0.013 0.008 0.001
shelf 0.220 0.672 -1950 0.000 0.015 0.000 0.000 0.067 0.002 0.072 0.004 -0.037 0.004
wctri 0.228 0.920 -275 0.000 0.040 0.776 0.000 0.090 0.776 0.028 0.014 -0.069 0.021
reg MargR2 CondR2 AIC mod
ai 0.089 0.708 -168 size ~ time + (time | spp)
ai 0.090 0.703 -161 size ~ time + type + (time | spp)
ai 0.117 0.736 -154 size ~ time * type + (time | spp)
ebs 0.070 0.868 -3334 size ~ time + (time | spp)
ebs 0.070 0.868 -3323 size ~ time + type + (time | spp)
ebs 0.069 0.868 -3309 size ~ time * type + (time | spp)
gmex 0.099 0.705 -1021 size ~ time + (time | spp)
gmex 0.099 0.710 -1014 size ~ time + type + (time | spp)
gmex 0.099 0.714 -1003 size ~ time * type + (time | spp)
goa 0.084 0.683 -799 size ~ time + (time | spp)
goa 0.087 0.684 -791 size ~ time + type + (time | spp)
goa 0.083 0.678 -783 size ~ time * type + (time | spp)
neus 0.129 0.734 -7726 size ~ time + (time | spp)
neus 0.132 0.735 -7714 size ~ time + type + (time | spp)
neus 0.136 0.731 -7713 size ~ time * type + (time | spp)
newf 0.084 0.868 -565 size ~ time + (time | spp)
newf 0.081 0.865 -556 size ~ time + type + (time | spp)
newf 0.083 0.864 -547 size ~ time * type + (time | spp)
sa 0.198 0.608 -1195 size ~ time + (time | spp)
sa 0.193 0.604 -1186 size ~ time + type + (time | spp)
sa 0.190 0.602 -1175 size ~ time * type + (time | spp)
shelf 0.226 0.670 -1954 size ~ time + (time | spp)
shelf 0.232 0.677 -1957 size ~ time + type + (time | spp)
shelf 0.220 0.672 -1950 size ~ time * type + (time | spp)
wctri 0.184 0.907 -288 size ~ time + (time | spp)
wctri 0.182 0.907 -280 size ~ time + type + (time | spp)
wctri 0.228 0.920 -275 size ~ time * type + (time | spp)

Conclusion for Range Change before Extinction/ after Colonization

As stated with the larger (pooled regional data) regressions, range size responds more consistently/ strongly to an approaching extinction/ departure from colonization than does range density. Range size shrinks as extinction approaches, increases as time since colonization increases. In both cases, there was only 1 region for which the direction (into extinction, out of colonization) mattered: for range density, it was the Southeast US (effect of pre-extinction direction = 0.037, p=0.043 [BH correction], Table 12), and for range size it was Scotial Shelf (effect = -0.021, p = 0.002). Time was a significant predictor of range size in all regions; time was not a significant predictor of range density in Newfoundland (p = 0.846 [BH]), and the Scotial Shelf (p = 0.916).

These results support the hypothesis that species rarity is closely associated with proximity to extinction/ colonization, and therefore the probability that the species will contribute to a change in species richness. Furthermore, these relationships have not been directly quantified for entire assemblages. Competing hypotheses exist regarding how range should change approaching extinction and after colonization. We find that the change in range is similar regardless of direction. However, spatial scale was important. Although slopes were similar, the intercept for density was far larger than for range size — the fraction of tows containing a species in occupied sites may remain high even when extinction is imminent, and may similar be large even if colonization was recent. Range size, on the other hand, was much closer to 0 near an absence.

Overall, the results suggest that the spatial footprint of individual species is important for understanding changes in species richness. Furthermore, because species contributing most to the dynamics of richness are those that repeatedly colonize and go extinct, it is meaningful to look at a species’ long-term rarity in order to gauge whether it is likely to contribute to those long-term richness changes. Determining what drives the geographic range of individual species is probably a powerful way to anticipate richness changes.

Exploring Range size vs absence time as individual regressions

rangeTimeDT[,nTime:=length(time),by=c("reg","type","event","spp")]
getEPR <- function(x){
    col_select <- c("Estimate","Pr(>|t|)")
    sx <- summary(x)
    EP <- sx$coefficients[2,col_select]
    names(EP) <- c("Estimate","Pr")
    R <- sx$r.squared
    data.table(as.data.table(as.list(EP)), Rsquared=R)
}
o <- rangeTimeDT[nTime>=3,j={
    getEPR(lm(size~time))
    } ,by=c("reg","type","event","nTime","spp")
]

par(mfcol=c(3,2), mar=c(2,2,0.5,0.5), cex=1, ps=10, mgp=c(1,0.1,0), tcl=-0.1)
hist(o[,Estimate])
hist(o[,(Pr)])
hist(o[,(Rsquared)])
setorder(o, nTime)
o[,j={plot(nTime,Pr); lines(spline(Pr~nTime),col='red',lwd=2)}]

NULL

o[,j={plot(nTime,Estimate); ; lines(spline(Estimate~nTime),col='red',lwd=2)}]

NULL

kable(o[,list(propSignificant=(sum(Pr<0.05)/sum(!is.na(Pr))),n=sum(!is.na(Pr))),by=c("reg","type")])
reg type propSignificant n
ai post_col 0.467 15
ebs post_col 0.373 51
ebs pre_ext 0.270 37
gmex post_col 0.297 37
gmex pre_ext 0.083 12
goa post_col 0.343 35
neus pre_ext 0.071 70
neus post_col 0.147 102
newf post_col 0.263 19
newf pre_ext 0.167 6
sa pre_ext 0.189 37
sa post_col 0.114 35
shelf pre_ext 0.200 15
shelf post_col 0.346 26
wctri post_col 0.217 23
wctri pre_ext 0.500 2
ai pre_ext 0.000 2
goa pre_ext 0.250 4
kable(o[,list(propSignificant=(sum(Pr<0.05)/sum(!is.na(Pr))),n=sum(!is.na(Pr))),by=c("reg","type")][,mean(propSignificant),by=c("type")])
type V1
post_col 0.285
pre_ext 0.192
**Exploration Figure** histograms of separate regressions of size ~ time; this is for each run-up to an extinction and reach follow-up to a colonization. Trying to understand how regularly the pattern might be observed. Hard to answer because adjusting the restriction on number of events in the run-up or follow-up (nTime) greatly affects the proportion that are significant.

Exploration Figure histograms of separate regressions of size ~ time; this is for each run-up to an extinction and reach follow-up to a colonization. Trying to understand how regularly the pattern might be observed. Hard to answer because adjusting the restriction on number of events in the run-up or follow-up (nTime) greatly affects the proportion that are significant.

The mixed effect models show a strong tendency for range size to be smaller near an absence. I figured that this relationship wouldn’t be apparent for all cases, which ended up being the result. However, the significance of these relationships depends on the number of years for each event stretch. So sample size matters. On one hand, this could hold implications for the rate of decline, so excluding short stretches might also exclude more cases with abrupt declines/ increases (although, these may well be preceded by long durations of stability, so not all abrupt declines/ increases would be excluded necessarily). However, the slope estimate doesnt change a whole lot across sample size: from 0 to ~10 years, increasing sample size (duration) also increases the slope (going from slightly negative to pretty consistently positive). Then it levels out, and doesn’t continue increasing. So it is not the case that longer stretches are long because they have more gradual slopes, necessarily. Actually, in the range of data for which there does appear to be a trend (0-10 samples), the slope becomes larger which is the opposite of “longer stretches result from more gradual processes” notion (thinking in terms of extinction).

Ultimately, I think the mixed effects model is far more appropriate so long as conclusions are cauched in general patterns, not in being able to detect the pattern for any single species. While it could have been interesting to present the many-regression results too, I think it is a separate analysis to consider how often this rule would apply to individual species. Such an analysis would benefit from understanding when the rule does and does not apply, not just the apparent frequency of relevance.


Supplement

Bin Size

load("../manuscript/manuscript_binSize.RData")
par(mfrow=c(3,3), mar=c(3,3,0.5,3), oma=c(1,1,1,0.5))
for(r in 1:length(regs)){
    image(bin_sites[[r]], axes=FALSE, col=fields::tim.colors())
    mtext(regs[r], side=3, line=0, font=2)
    xl <- as.numeric(rownames(bin_sites[[r]]))
    yl <- as.numeric(colnames(bin_sites[[r]]))
    axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
    axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
    if(r==length(regs)){
        mtext("Longitude-latitude bin size (degrees)", side=1, line=0, outer=TRUE)
        mtext("Depth bin size (meters)", side=2, line=-0.5, outer=TRUE)
    }
    fields::image.plot(bin_sites[[r]], axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)
}
**Bin Size** How Bin size (lon, lat, depth) affects the number of sites in a region. The vertical axis is the size of the depth bin in meters, horizontal axis is the size of the lon-lat bin in degrees, and the colors indicate the number of sites in that region that are sampled for at least 85% of years.

Bin Size How Bin size (lon, lat, depth) affects the number of sites in a region. The vertical axis is the size of the depth bin in meters, horizontal axis is the size of the lon-lat bin in degrees, and the colors indicate the number of sites in that region that are sampled for at least 85% of years.

scaled_bin_sites <- lapply(bin_sites, function(x){x2 <- x; x2[,] <- scale(c(x)); x2})
mean_sbs <- apply(simplify2array(scaled_bin_sites), c(1,2), mean, na.rm=TRUE)
min_sbs <- apply(simplify2array(scaled_bin_sites), c(1,2), min, na.rm=TRUE)
par(mfrow=c(2,1), mar=c(4,4,1,5))

image(mean_sbs, axes=FALSE, col=fields::tim.colors())
mtext("cross-region average of z-scores", side=3, line=0, font=2)
xl <- as.numeric(rownames(bin_sites[[1]]))
yl <- as.numeric(colnames(bin_sites[[1]]))
axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
mtext("Longitude-latitude bin size (degrees)", side=1, line=2)
mtext("Depth bin size (meters)", side=2, line=2)
fields::image.plot(mean_sbs, axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)

image(min_sbs, axes=FALSE, col=fields::tim.colors())
mtext("cross-region minimum of z-scores", side=3, line=0, font=2)
axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
mtext("Longitude-latitude bin size (degrees)", side=1, line=2)
mtext("Depth bin size (meters)", side=2, line=2)
fields::image.plot(min_sbs, axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)
**Bin Size** How Bin size (lon, lat, depth) affects the number of sites in a region. To compare across regions, we can first z-score each region, then take the cross-region average for each bin size combination.

Bin Size How Bin size (lon, lat, depth) affects the number of sites in a region. To compare across regions, we can first z-score each region, then take the cross-region average for each bin size combination.

Tows per site, sites per region

sumry_counts <- data_all[reg!="wcann", 
    j = {
        yi <- table(diff(sort(unique(year))))
        yi_form <- paste0(names(yi), paste("(",yi,")",sep=""))
        data.table(
            "N" = trawlData::lu(year),
            "Years" = paste(min(year),max(year),sep=" - "),
            "Frequency" = paste(yi_form,collapse=", "),
            "Sites" = trawlData::lu(stratum), 
            # .SD[,list("max. Tows"=max(trawlData::lu(haulid)), "Average Tows" = mean(trawlData::lu(haulid))),by=c("stratum","year")]
            "Tows" = paste0(
                .SD[,trawlData::lu(haulid),by=c("stratum","year")][,max(V1)],
                paste0("(",.SD[,trawlData::lu(haulid),by=c("stratum","year")][,round(mean(V1),1)],")")
                ),
            # "Max. Tows" = .SD[,trawlData::lu(haulid),by=c("stratum","year")][,max(V1)],
#           "Avg Tows" = .SD[,trawlData::lu(haulid),by=c("stratum","year")][,mean(V1)],
            "Species" = trawlData::lu(spp)
        )
    }, 
    by=c('reg')
]
setnames(sumry_counts, 'reg', "Region")
sumry_counts[,Region:=factor(Region, levels=c("ebs","ai","goa","wctri","gmex","sa","neus","shelf","newf"), labels=c("E. Bering Sea","Aleutian Islands","Gulf of Alaska","West Coast US","Gulf of Mexico","Southeast US", "Northeast US", "Scotian Shelf","Newfoundland"))]
setkey(sumry_counts, Region)

sumry_counts[,Org:=c("AFSC","AFSC","AFSC","NMFS","GSMFC","SCDNR","NEFSC","DFO","DFO")]
# sumry_counts[,"Avg Tows":=round(eval(s2c("Avg Tows"))[[1]],1)]
stargazer(sumry_counts, summary=FALSE, rownames=FALSE, column.sep.width="2pt", digits.extra=0, digits=NA)
% Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu % Date and time: Fri, Mar 31, 2017 - 23:45:27
# \begin{table}[!htbp] \centering
# %  \caption{}
# %  \label{}
# \begin{tabular}{@{\extracolsep{2pt}} cccccccc}
# \\[-1.8ex]\hline
# \hline \\[-1.8ex]
# Region & N & Years & Frequency & Sites & Tows & Species & Org \\
# \hline \\[-1.8ex]
# E. Bering Sea & $31$ & 1984 - 2014 & 1(30) & $138$ & 4(1.6) & $110$ & $\text{AFSC}^{*}$ \\
# Aleutian Islands & $12$ & 1983 - 2014 & 2(5), 3(4), 4(1), 5(1) & $82$ & 12(2.8) & $55$ & $\text{AFSC}^{*}$ \\
# Gulf of Alaska & $13$ & 1984 - 2013 & 2(7), 3(5) & $89$ & 15(4.4) & $98$ & $\text{AFSC}^{*}$ \\
# West Coast US & $10$ & 1977 - 2004 & 3(9) & $84$ & 91(4.7) & $92$ & $\text{NMFS}^{\dagger}$ \\
# Gulf of Mexico & $17$ & 1984 - 2000 & 1(16) & $39$ & 39(5.7) & $144$ & $\text{GSMFC}^{\ddagger}$ \\
# Southeast US & $25$ & 1990 - 2014 & 1(24) & $24$ & 13(3.9) & $104$ & $\text{SCDNR}^{\S}$ \\
# Northeast US & $32$ & 1982 - 2013 & 1(31) & $100$ & 10(3) & $141$ & $\text{NEFSC}^{\P}$ \\
# Scotian Shelf & $41$ & 1970 - 2010 & 1(40) & $48$ & 11(2.5) & $48$ &$\text{DFO}^{\nabla}$ \\
# Newfoundland & $16$ & 1996 - 2011 & 1(15) & $191$ & 9(2.2) & $72$ & $\text{DFO}^{\nabla}$ \\
# \hline \\[-1.8ex]
# \end{tabular}
# \end{table}

# knitr::kable(sumry_counts, caption="Years = Total number of years sampled, Year Range = the minimum and maximum of years sampled, Year Interval = the number of years elapsed between samples (and the frequency of this interval in parentheses), Sites = the total number of sites in the region, Max. Tows = maximum number of tows per site per year, Average Tows = the average number of tows per site per year, Total Species = the total number of species observed in the region across all sites and years.")

Years Excluded & Strata Sampled

reg_depthStratum <- c(
    "ebs" = 500,
    "ai" = 100,
    "goa" = 500,
    "wctri" = 100, 
    "wcann" = 500, 
    "gmex" = 500, 
    "sa" = 500, 
    "neus" = 500, 
    "shelf" = 500, 
    "newf" = 500
)

reg_tolFraction <- c(
    "ebs" = 0,
    "ai" = 0.15,
    "goa" = 0,
    "wctri" = 0.15, 
    "wcann" = 0.15, 
    "gmex" = 0.15, 
    "sa" = 0.15, 
    "neus" = 0.15, 
    "shelf" = 0.15, 
    "newf" = 0.15
)
regs <- names(reg_tolFraction)

yr_subs <- list(
    ebs = bquote((year)>1983),
    ai = NA,
    goa = NA,
    gmex = bquote((year)>1983 & (year)<2001),
    neus = bquote((year)>1981 & (year)<2014),
    newf = bquote((year)>1995),
    sa = bquote((year)>=1990),
    shelf = bquote((year)!=2011),
    wctri = NA
)
yr_ablin <- list(
    ebs = 1983,
    ai = NA,
    goa = NA,
    gmex = c(1983,2001),
    neus = c(1981,2014),
    newf = 1995,
    sa = 1989,
    shelf = 2011,
    wctri = NA
)
yregs <- names(yr_subs)
par(mfrow=c(3, 3), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
    
    b <- trim_msom(yregs[r], gridSize=0.5, grid_stratum=TRUE, depthStratum=reg_depthStratum[yregs[r]], tolFraction=0.5, plot=FALSE, cull_show_up=FALSE, trimYears=FALSE)
    
    b_tbl <- b[,table(year,stratum)>0]
    b_tbl <- b_tbl[,order(colSums(b_tbl))]
    image(b_tbl, axes=FALSE)
    at_vals <- seq(0,1,length.out=nrow(b_tbl))
    axis(1, at=at_vals, labels=rownames(b_tbl))
    abline(v=at_vals[rownames(b_tbl)%in%yr_ablin[[r]]])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
}
**Trimming Years b/c of Strata, Counts of Analyzed and Excluded Strata** Along the vertical axis is a stratum ID. Along the horizontal axis is a year of sampling. The colors are binary: the light color indicates that the stratum was sampled in that year, and the red color indicates that the stratum was not sampled. The vertical line indicates that years at the line or to the left (if the line is on the left half of graph) or to the right (if line is on right half of graph) were excluded from the analysis, inclusively. E.g., for E. Bering Sea (ebs), 1982 and 1983 were excluded. Three regions had no years excluded (ai = Aleutian Islands, goa = Gulf of Alaska, wctri West Coast US).  The last three regions had years excluded because of changes in the number or location of strata sampled: gmex = Gulf of Mexico 1983 < years < 2001; newf = Newfoundland 1995 < years; sa = Southeast US 1989 < years. Three regions had years excluded for reasons other than number of strata sampled: ebs = E. Bering Sea years < 1984, years excluded due to massive early increase in number of species sampled each year, change in identification suspected; neus = Northeast US 1981 < years < 2014; shelf = Scotian Shelf years < 2011, bottom temperature not available in final year (used as covariate in occupancy model). Finally, note that in GoA strata had to be present in 100% of years or else they were not included; this was done b/c in 2001 the western-most stratum that was sampled was much further to the east than in other years. Similarly, in E. Bering Sea strata had to be present in all years, otherwise the Northeastern extent of the sampled range was reduced substantially in several years.

Trimming Years b/c of Strata, Counts of Analyzed and Excluded Strata Along the vertical axis is a stratum ID. Along the horizontal axis is a year of sampling. The colors are binary: the light color indicates that the stratum was sampled in that year, and the red color indicates that the stratum was not sampled. The vertical line indicates that years at the line or to the left (if the line is on the left half of graph) or to the right (if line is on right half of graph) were excluded from the analysis, inclusively. E.g., for E. Bering Sea (ebs), 1982 and 1983 were excluded. Three regions had no years excluded (ai = Aleutian Islands, goa = Gulf of Alaska, wctri West Coast US). The last three regions had years excluded because of changes in the number or location of strata sampled: gmex = Gulf of Mexico 1983 < years < 2001; newf = Newfoundland 1995 < years; sa = Southeast US 1989 < years. Three regions had years excluded for reasons other than number of strata sampled: ebs = E. Bering Sea years < 1984, years excluded due to massive early increase in number of species sampled each year, change in identification suspected; neus = Northeast US 1981 < years < 2014; shelf = Scotian Shelf years < 2011, bottom temperature not available in final year (used as covariate in occupancy model). Finally, note that in GoA strata had to be present in 100% of years or else they were not included; this was done b/c in 2001 the western-most stratum that was sampled was much further to the east than in other years. Similarly, in E. Bering Sea strata had to be present in all years, otherwise the Northeastern extent of the sampled range was reduced substantially in several years.

Years Excluded & Strata Excluded, Part 2

par(mfrow=c(9, 5), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
    
    b <- trim_msom(yregs[r], gridSize=0.5, grid_stratum=TRUE, depthStratum=reg_depthStratum[yregs[r]], tolFraction=0.15, plot=FALSE, cull_show_up=FALSE, trimYears=FALSE)
    plot(b[,list("# strata sampled"=trawlData::lu(stratum)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("min latitude"=min(lat)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("max latitude"=max(lat)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("min longitude"=min(lon)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("max longitude"=max(lon)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
}
**Trimming Years b/c of Stratum Locations, Stats for Analyzed and Excluded Strata**  The number and coordinate extreme of strata in each region if all years had been included in the analysis, and if strata were only removed if they were absent in more than 15% of years. Note that the 15%  tolerance rule used here can differ from the implemented rule in two ways: 1) for regions with years excluded, the increased number of years may change whether or not a stratum meets the 15% cutoff; 2) two of the regions (E Bering Sea and Gulf of Alaska) had a 0% tolerance, therefore the same strata were sampled in each year for these regions (though the changes in coordinate extrema and stratum count illustrate the need for the unique tolerance rule in these regions).

Trimming Years b/c of Stratum Locations, Stats for Analyzed and Excluded Strata The number and coordinate extreme of strata in each region if all years had been included in the analysis, and if strata were only removed if they were absent in more than 15% of years. Note that the 15% tolerance rule used here can differ from the implemented rule in two ways: 1) for regions with years excluded, the increased number of years may change whether or not a stratum meets the 15% cutoff; 2) two of the regions (E Bering Sea and Gulf of Alaska) had a 0% tolerance, therefore the same strata were sampled in each year for these regions (though the changes in coordinate extrema and stratum count illustrate the need for the unique tolerance rule in these regions).

par(mfrow=c(9, 5), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
    
    # b <- trim_msom(yregs[r], gridSize=0.5, depthStratum=reg_depthStratum[yregs[r]], tolFraction=reg_tolFraction[yregs[r]], grid_stratum=TRUE, plot=FALSE, cull_show_up=FALSE)
    b <- data_all[reg==yregs[r]]
    
    plot(b[,list("# strata sampled"=trawlData::lu(stratum)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("min latitude"=min(lat)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("max latitude"=max(lat)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("min longitude"=min(lon)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
    plot(b[,list("max longitude"=max(lon)),by=c("year")])
    mtext(yregs[r], side=3, line=0, adj=0, font=2)
    abline(v=yr_ablin[[r]])
}
**Trimming Years b/c of Stratum Locations, Stats for the Analyzed Strata** Changes in stratum statistics (number of strata, min/max lon/lat) for strata included in results in paper.

Trimming Years b/c of Stratum Locations, Stats for the Analyzed Strata Changes in stratum statistics (number of strata, min/max lon/lat) for strata included in results in paper.


Revision Exploration

Are rare species becoming more common?

blah <- spp_master[present==1, j={
    # if(any(col==1) | any(ext==1)){
    #   list(propSlope=NA_real_, mu_propStrat=mean(propStrata))
    # }else{
    #   list(propSlope=summary(lm(propStrata~year))$coeff[2,1], mu_propStrat=mean(propStrata))
    # }
    list(propSlope=summary(lm(propStrata~year))$coeff[2,1], mu_propStrat=mean(propStrata), ce_categ=ce_categ[1])
} ,by=c("reg","spp")]

# par(mfrow=c(3,3))
# blah[,plot(mu_propStrat, propSlope, main=reg[1]),by='reg']

par(mfrow=c(3,3))
ureg <- blah[,unique(reg)]
nreg <- length(ureg)
for(r in 1:nreg){
    blah[reg==ureg[r],j={boxplot(propSlope~ce_categ, main=reg[1], outline=FALSE);abline(h=0);NULL}]
}

I wanted to determine if species in the region are becoming more widespread over time. In other words, when the species is present, what fraction of the region does it occupy? Does that fraction change over time? This slope is the y-axis for the boxplots. It indicates the long-term slope of the proportion of sites occupied (excluding any year when that proportion was 0). The x-axis represents the categories of colonization-extinction patterns. Species in the “both” category experienced both colonization and extinction events at some point in the time series.

Other figures and analyses suggest that species in the “both” category play an important role in increases in species richness. In other words, the coming-and-going of these organisms is not neutral in the long term. This conjecture is supported by 1) the magnitude of increase for regions with increasing richness cannot be explained by “colonizing” species alone; 2) geographically constrained species have more colonizations and extinctions that widespread species.

Looking at the different box categories, if historically rare species (i.e., those that were in the both category and had most of the colonizations/ extinctions) were appearing more consistently in the time series because they were becoming more geogrpahically widespread (and therefore less likely to go extinct), then we’d expect to see that the “both” category to have more-positive slopes in their annual range sizes.

That’s kinda? what I see. It’s definitely not a very strong relationship. One issue with this analysis is that you wouldn’t expect any given species in the “both” category to become more prevalent. In fact, it should only be a small fraction of those species that are becomign more widespread and thus contributing to long-term increases in species richness.


Time Series of Local and Regional Richness

localR <- data_all[,list(lR=length(unique(spp))),by=c("reg","stratum","year")]
bothR <- merge(localR, comm_master[,list(reg,year,naive_rich,reg_rich)],by=c("reg","year"))
par(mfrow=c(3,3), cex=1, ps=8, mar=c(2,2,2,2), mgp=c(1,0.2,0), tcl=-0.2)
ureg <- bothR[,unique(reg)]
nreg <- length(ureg)
for(r in 1:nreg){
    bothR[reg==ureg[r],list(mu_lR=mean(lR),naive_rich=naive_rich[1],reg_rich=reg_rich[1]),by=c("reg","year")][,j={
        plot(year, mu_lR, type='l')
        mtext(reg[1], side=3, line=0, adj=0.1, font=2)
        # par(new=TRUE)
        # plot(year, reg_rich, type='l', col='blue', xaxt='n', yaxt='n', xlab='',ylab='')
        par(new=TRUE)
        plot(year, naive_rich, type='l', col='red', xaxt='n', yaxt='n', xlab='',ylab='')
        axis(side=4, col='red')
        NULL
    }]
}

# mtext("Observed Regional (red), MSOM Regional (blue), and Mean Local (black) Richness", side=3, line=-0.75, outer=TRUE, font=2)
mtext("Observed Regional (red) and Mean Local (black) Richness", side=3, line=-0.75, outer=TRUE, font=2)

In general, local richness and regional richness are similar. There are differences, possibly in some cases the regional slope might be significant whereas local would not be (I haven’t checked, though). However, the trends aren’t in opposite directions at the two scales.


All Years All Species Present or Absent

# # dev.new(width=5, height=7)
# # pdf("~/Desktop/pres-abs-allSpp-time.pdf",width=5, height=7)
ureg <- spp_master[,unique(reg)]
for(r in 1:length(ureg)){
    t_table <- spp_master[reg==ureg[r] & present==1, table(spp,year)]
    t_table2 <- t(t_table)#[ncol(t_table):1,]
    t_table3 <- t_table2[,order(colSums(t_table2))]

    par(mar=c(1,5,0.5,1), ps=8, mgp=c(0.75,0.2,0), tcl=-0.15)
    image(t_table3, axes=FALSE)
    # grid(ny=ncol(t_table3)+1, nx=nrow(t_table3)+1)
    abline(h=seq(0,1,length.out=ncol(t_table3)), v=seq(0,1,length.out=nrow(t_table3)), col='gray', lty='dotted', lwd=0.5)
    axis(side=2, at=seq(0,1,length.out=ncol(t_table3)), label=colnames(t_table3), las=1, cex.axis=0.5)
    axis(side=1, at=seq(0,1,length.out=nrow(t_table3)), label=rownames(t_table3))
    text(0.95,0.95, label=ureg[r], font=2)
}
# # # dev.off()


Relationships between alpha, beta, and gamma (MSOM) diversity

localR <- data_all[,list(lR=length(unique(spp))),by=c("reg","stratum","year")]
bothR <- merge(localR, comm_master[,list(reg,year,naive_rich,reg_rich)],by=c("reg","year"))
bothR_mu <- bothR[,list(lR_mu=mean(lR)),by=c("reg","year")]
cm2 <- merge(comm_master, bothR_mu)
eval(figure_setup())
# dev.new(width=3.5, height=7)
par(mfrow=c(3,1), mar=c(1.75,1.5,0.25,0.25),mgp=c(0.85,0.1,0), tcl=-0.1, cex=1, ps=8)
ureg <- cm2[,unique(reg)]
nreg <- length(ureg)
cm2[,j={
    plot(reg_rich, beta_div_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, xlab="Regional Richness", ylab="Beta Diversity")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(reg_rich),j={
            lines(reg_rich,predict(lm(beta_div_mu~reg_rich)),col='black')
        }]
    }
}]

NULL

cm2[,j={
    plot(lR_mu, beta_div_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, xlab="Average Local Richness", ylab="Beta Diversity")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(reg_rich),j={
            lines(lR_mu,predict(lm(beta_div_mu~lR_mu)),col='black')
        }]
    }
}]

NULL

cm2[,legend("topright",ncol=2,legend=pretty_reg[una(reg)],text.col=pretty_col[una(reg)], inset=c(-0.01, -0.03), bty='n', x.intersp=0.15, y.intersp=0.65)]
rect                                 text

1: 32.1 30.3,30.3,30.3,30.3,30.3,45.8, 2: 0.211 0.582,0.549,0.517,0.485,0.453,0.582, 3: 27.5 30.3,30.3,30.3,30.3,30.3,45.8, 4: 0.623 0.582,0.549,0.517,0.485,0.453,0.582,

cm2[,j={
    plot(reg_rich, lR_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, ylab="Average Local Richness", xlab="Regional Richness")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(reg_rich),j={
            lines(reg_rich,predict(lm(lR_mu~reg_rich)),col='black')
        }]
    }
}]

NULL

Relationships between alpha, beta, and gamma (Naive) diversity

localR <- data_all[,list(lR=length(unique(spp))),by=c("reg","stratum","year")]
bothR <- merge(localR, comm_master[,list(reg,year,naive_rich,reg_rich)],by=c("reg","year"))
bothR_mu <- bothR[,list(lR_mu=mean(lR)),by=c("reg","year")]
cm2 <- merge(comm_master, bothR_mu)
eval(figure_setup())
# dev.new(width=3.5, height=7)
par(mfrow=c(3,1), mar=c(1.75,1.5,0.25,0.25),mgp=c(0.85,0.1,0), tcl=-0.1, cex=1, ps=8)
ureg <- cm2[,unique(reg)]
nreg <- length(ureg)
cm2[,j={
    plot(naive_rich, beta_div_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, xlab="Observed Regional Richness", ylab="Beta Diversity")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(naive_rich),j={
            lines(naive_rich,predict(lm(beta_div_mu~naive_rich)),col='black')
        }]
    }
}]

NULL

cm2[,j={
    plot(lR_mu, beta_div_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, xlab="Average Local Richness", ylab="Beta Diversity")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(naive_rich),j={
            lines(lR_mu,predict(lm(beta_div_mu~lR_mu)),col='black')
        }]
    }
}]

NULL

cm2[,legend("topright",ncol=2,legend=pretty_reg[una(reg)],text.col=pretty_col[una(reg)], inset=c(-0.01, -0.03), bty='n', x.intersp=0.15, y.intersp=0.65)]
rect                                 text

1: 32.1 30.3,30.3,30.3,30.3,30.3,45.8, 2: 0.211 0.582,0.549,0.517,0.485,0.453,0.582, 3: 27.5 30.3,30.3,30.3,30.3,30.3,45.8, 4: 0.623 0.582,0.549,0.517,0.485,0.453,0.582,

cm2[,j={
    plot(naive_rich, lR_mu, col=adjustcolor(pretty_col[reg],0.5), pch=16, ylab="Average Local Richness", xlab="Observed Regional Richness")
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(naive_rich),j={
            lines(naive_rich,predict(lm(lR_mu~naive_rich)),col='black')
        }]
    }
}]

NULL


Richness-range size, varying long-term and annual, all spp and transients only

By transient I mean species that weren’t present in every year.

noNeither <- spp_master[,j={
    td <- .SD[ce_categ!="neither" & present==1]
    td1.0 <- td[,list(propStrata_noNeither_ltAvg=mean(propStrata)),keyby=c("reg","spp")]
    setkey(td, reg, spp)
    td1.1 <- td[td1.0]
    td1.2 <- td1.1[,list(propStrata_noNeither_avg_ltAvg=mean(propStrata_noNeither_ltAvg)),by=c("reg","year")]
    td2 <- td[,list(propStrata_noNeither_avg=mean(propStrata)),by=c("reg","year")]
    td_out <- merge(td1.2, td2, by=c('reg','year'))
}]
cmNN <- merge(comm_master, noNeither)

par(mfrow=c(2,2))
ureg <- comm_master[,unique(reg)]
nreg <- length(ureg)
comm_master[,j={
    plot(propStrata_avg_ltAvg, reg_rich, col=adjustcolor(pretty_col[reg],0.5), pch=16)
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(propStrata_avg_ltAvg),j={
            lines(propStrata_avg_ltAvg, predict(lm(reg_rich~propStrata_avg_ltAvg)))
        }]
    }
}]

NULL

comm_master[,j={
    plot(propStrata_avg, reg_rich, col=adjustcolor(pretty_col[reg],0.5), pch=16)
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(propStrata_avg),j={
            lines(propStrata_avg, predict(lm(reg_rich~propStrata_avg)))
        }]
    }
}]

NULL

cmNN[,j={
    plot(propStrata_noNeither_avg_ltAvg, reg_rich, col=adjustcolor(pretty_col[reg],0.5), pch=16)
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(propStrata_noNeither_avg_ltAvg),j={
            lines(propStrata_noNeither_avg_ltAvg, predict(lm(reg_rich~propStrata_noNeither_avg_ltAvg)))
        }]
    }
}]

NULL

cmNN[,j={
    plot(propStrata_noNeither_avg, reg_rich, col=adjustcolor(pretty_col[reg],0.5), pch=16)
    for(r in 1:nreg){
        .SD[reg==ureg[r]][order(propStrata_noNeither_avg),j={
            lines(propStrata_noNeither_avg, predict(lm(reg_rich~propStrata_noNeither_avg)))
        }]
    }
}]

NULL


Range Size Over Time For Community vs Transients

rE_logic1 <- bquote(reg==rr & ce_categ!="neither" & present==1)
rE_logic2 <- bquote(reg==rr & ce_categ!="neither")
rE_logic3 <- bquote(reg==rr & ce_categ=="neither")
colQ <- bquote(adjustcolor(c("pre_ext"="red","post_col"="blue")[stretch_type],0.5))
colQ2 <- bquote(adjustcolor(c("pre_ext"="red","post_col"="blue")[stretch_type],1))

# dev.new(width=6.5, height=6.5)
par(mfrow=c(3,3), mgp=c(0.85,0.2,0), ps=8, cex=1, mar=c(1.75,1.75,0.5,0.5), tcl=-0.15)
ur <- spp_master[,unique(reg)]
for(r in 1:length(ur)){
    rr <- ur[r]
    rEl3 <- spp_master[order(year)][eval(rE_logic3)]
    rEl2 <- spp_master[order(year)][eval(rE_logic2)]
    rEl1 <- rEl2[order(year)][eval(rE_logic1)]
    u_spp <- rEl1[, unique(spp)]

    rEl1[,j={bp_dat <<- boxplot(propStrata~year,plot=FALSE);NULL}]
    bp_ylim <- unlist(bp_dat[c("stats")], use.names=FALSE)
    medRange_pres0 <- rEl2[,list(propStrata=median(propStrata)),by='year']
    medRange_noNeith <- rEl3[,list(propStrata=median(propStrata)),by='year']
    rEl1_qylim <- rEl1[,quantile(propStrata,c(0.25,0.75)),by='year'][,range(V1)]
    rEl3_qylim <- rEl3[,quantile(propStrata,c(0.25,0.75)),by='year'][,range(V1)]
    ylim <- range(c(
        bp_ylim,
        rEl1_qylim,
        rEl3_qylim,
        # medRange_pres0[,propStrata],
        medRange_noNeith[,propStrata]#,
    ))
    rEl1[,j={boxplot(propStrata~year, add=FALSE, at=unique(year), outline=FALSE, axes=TRUE, ylim=ylim); NULL}]
    
    
    lines(rEl1[,median(propStrata),by='year'], lwd=2, col='red')
    lines(rEl1[,quantile(propStrata,0.75),by='year'], lwd=1, col='red')
    lines(rEl1[,quantile(propStrata,0.25),by='year'], lwd=1, col='red')
    
    lines(rEl3[,median(propStrata),by='year'], lwd=2, col='blue')
    lines(rEl3[,quantile(propStrata,0.75),by='year'], lwd=1, col='blue')
    lines(rEl3[,quantile(propStrata,0.25),by='year'], lwd=1, col='blue')

    comm_master[reg==rr, lines(year,propStrata_avg_ltAvg, col='black')]
    mtext(pretty_reg[rr], line=-0.75, side=3, adj=0.1, font=2)
}


Time Series of Range Size, Color is Local Richness

rE_logic1 <- bquote(reg==rr & ce_categ!="neither" & present==1)
rE_logic2 <- bquote(reg==rr & ce_categ!="neither")
rE_logic3 <- bquote(reg==rr & ce_categ=="neither")
colQ <- bquote(adjustcolor(c("pre_ext"="red","post_col"="blue")[stretch_type],0.5))
colQ2 <- bquote(adjustcolor(c("pre_ext"="red","post_col"="blue")[stretch_type],1))

# dev.new(width=6.5, height=6.5)
par(mfrow=c(3,3), mgp=c(0.85,0.2,0), ps=8, cex=1, mar=c(1.75,1.75,0.5,0.5), tcl=-0.15, oma=c(0.25,0.1,0.1,0.1))
ur <- spp_master[,unique(reg)]
for(r in 1:length(ur)){
    rr <- ur[r]
    rEl3 <- spp_master[order(year)][eval(rE_logic3)]
    rEl2 <- spp_master[order(year)][eval(rE_logic2)]
    rEl1 <- rEl2[order(year)][eval(rE_logic1)]
    u_spp <- rEl1[, unique(spp)]
    
    nCols <- rEl1[,length(unique(year))]
    cols <- viridis(nCols)
    nTrans <- rEl1[,colSums(table(spp,year))]
    nTrans <- nTrans[order(as.integer(names(nTrans)))]
    colVec_ind <- cut(nTrans, breaks=nCols)
    colVec <- cols[colVec_ind]
    
    rEl1[,j={bp_dat <<- boxplot(propStrata~year,plot=FALSE);NULL}]
    bp_ylim <- unlist(bp_dat[c("stats")], use.names=FALSE)
    ylim <- range(c(
        bp_ylim
    ))
    rEl1[,j={boxplot(propStrata~year, add=FALSE, at=unique(year), col=colVec, outline=FALSE, axes=TRUE, ylim=ylim); NULL}]
    if(rr=="sa"){
        mapLegend(x=0.3, y=0.78, zlim=range(nTrans),cols=cols)
    }else{
        mapLegend(x=0.05, y=0.78, zlim=range(nTrans),cols=cols)
    }
    mtext(pretty_reg[rr], line=-0.75, side=3, adj=0.1, font=2)
}
mtext("Range Size of Transient Species", side=2, line=-0.75, outer=TRUE, font=2)
mtext("Year", side=1, line=-0.75, outer=TRUE, font=2)


Time Series of Tows per Site

par(mfrow=c(3,3), mar=c(4,3.5,1,1))
ureg <- data_all[reg!='wcann',unique(reg)]
nreg <- length(ureg)
for(r in 1:nreg){
    data_all[reg==ureg[r],list(Kmax=unique(Kmax)),by=c("reg","year","stratum")][,j={boxplot(Kmax~year, main=reg[1]);NULL},by='reg']
}
mtext("Tows per site", side=2, line=-1.25, outer=TRUE)